Workflow Landing Requests #18807

Merged
merged 14 commits into from
Oct 1, 2024

Conversation

jmchilton
Member

@jmchilton jmchilton commented Sep 12, 2024

I've added a little CLI tool for generating workflow landing request URLs. External clients won't need to use it, but it is a simple example with an easy-to-follow format and flow that shows how to use the APIs to generate custom forms pre-populated with data for users.

$ . .venv/bin/activate; PYTHONPATH=lib python lib/galaxy/tool_util/client/landing.py -g http://localhost:8081 -s mycoolthing simple_workflow
Your customized form is located at http://localhost:8081/workflow_landings/0d3862e7-345f-4805-a080-5a2cbc2fc547?secret=mycoolthing
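
For readers following along, here is a minimal sketch of how the URL printed above could be assembled once a landing request UUID is known. It only mirrors the URL shape shown in the CLI output; the helper name `build_landing_url` is hypothetical and is not part of `landing.py` or the Galaxy API.

```python
from typing import Optional
from urllib.parse import urlencode, urljoin


def build_landing_url(galaxy_url: str, landing_uuid: str, secret: Optional[str] = None) -> str:
    """Assemble a workflow landing URL (hypothetical helper, not Galaxy code)."""
    # Normalize the base so urljoin appends the path instead of replacing it.
    base = galaxy_url.rstrip("/") + "/"
    url = urljoin(base, f"workflow_landings/{landing_uuid}")
    if secret:
        # The secret travels as a query parameter, as in the CLI output above.
        url = f"{url}?{urlencode({'secret': secret})}"
    return url


example_url = build_landing_url(
    "http://localhost:8081",
    "0d3862e7-345f-4805-a080-5a2cbc2fc547",
    "mycoolthing",
)
print(example_url)
```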

This pairs well with the new {src: "url",...} data input dictionaries and a potential minimal workflow running UI (pictured below).

[Screenshot: minimal workflow running UI, 2024-09-12]

The UI piece is clearly an MVP - hopefully a tool form UI expert can take it over.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@jmchilton
Member Author

This is intertwined with the tool request API, which I thought was closer to done than it was. I'm finding so many corner cases in modeling tools. If this becomes a blocker for anyone, I think I could pretty easily pull out the database migration, the workflow landing stuff, the workflow API that adds {src: "url", ... } inputs, and the form input for the URI parameters.

Alternatively, I could pull just the tool API & backend enhancements out and push them back into #17393. We could then keep all the modeling enhancements in here and treat this a bit more like the CWL branch, where we have models we sync up but keep the plumbing out until it is solid.

Otherwise I will just spend this week trying to work on the models and just spin some smaller stuff out here and there where it makes sense.

@jmchilton jmchilton force-pushed the landing branch 12 times, most recently from 64d2efa to 27359a0 Compare September 20, 2024 14:13
Member

@mvdbeek mvdbeek left a comment

This is some seriously cool stuff, thank you @jmchilton!

required: false
desc: |
  Workflows launched with URI/URL inputs that are not marked as 'deferred'
  are "materialized" (or undeferred) by the workflow scheduler. This might be
Member

Is it hard to do this in Celery? Feels like a feature we can require Celery for.

Member Author

I think all of that pipeline could be Celery-ified - given that this should work as is, though, that feels like a second iteration. Our workflow handlers "just work" now - they could be ... more decomposed for sure, but it is a bigger project than we need for this feature, I think. Workflow scheduling handlers never really materialized the way I wanted - it is maybe the most pointless attempt I ever made at making something pluggable. I would be up for just scrapping it all for Celery. But again... future project, I think.

if len(transform) > 0:
    dataset_source.transform = transform

sa_session.add(hda)
Member

This might be a good spot to check whether we already have a dataset with the same source, hash, transform, and owner. I think workflows with repeated inputs are going to be a thing (think cross product). This is basically just a comment to myself, not required for a first iteration.
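
As a rough illustration of that idea (not how Galaxy implements or will implement it), the check could be keyed on the four attributes named above. Everything in this sketch is hypothetical: `SourceKey`, `DatasetCache`, and the in-memory dictionary stand in for whatever database query would actually be used.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SourceKey:
    """Hypothetical identity of a URL-backed dataset (not a Galaxy model)."""

    source_uri: str
    hash_value: Optional[str]
    transform: Tuple[str, ...]
    owner_id: int


class DatasetCache:
    """Toy in-memory stand-in for a lookup of existing datasets."""

    def __init__(self) -> None:
        self._by_key: Dict[SourceKey, int] = {}
        self._next_id = 1

    def get_or_create(self, key: SourceKey) -> Tuple[int, bool]:
        """Return (dataset_id, created); reuse the id when the key was seen before."""
        existing = self._by_key.get(key)
        if existing is not None:
            return existing, False
        dataset_id = self._next_id
        self._next_id += 1
        self._by_key[key] = dataset_id
        return dataset_id, True
```

With a key like this, a workflow that lists the same URL input several times for the same user would create one dataset and reuse it, while the same URL requested by a different owner would still get its own dataset.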

@mvdbeek
Member

mvdbeek commented Sep 30, 2024

lib/galaxy_test/api/test_workflows.py::TestWorkflowsApi::test_run_workflow_with_invalid_url_hashes INFO:     127.0.0.1:58086 - "GET /api/users HTTP/1.1" 200 OK
INFO:     127.0.0.1:58087 - "POST /api/users/adb5f5c93f827949/api_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:58088 - "GET /api/tools?in_panel=False HTTP/1.1" 200 OK
multipart.multipart DEBUG 2024-09-30 16:35:52,199 [pN:main,p:62707,tN:Thread-9 (run_in_loop)] Calling on_field_start with no data
multipart.multipart DEBUG 2024-09-30 16:35:52,199 [pN:main,p:62707,tN:Thread-9 (run_in_loop)] Calling on_field_name with data[0:4]
multipart.multipart DEBUG 2024-09-30 16:35:52,199 [pN:main,p:62707,tN:Thread-9 (run_in_loop)] Calling on_field_data with data[5:17]
multipart.multipart DEBUG 2024-09-30 16:35:52,199 [pN:main,p:62707,tN:Thread-9 (run_in_loop)] Calling on_field_end with no data
multipart.multipart DEBUG 2024-09-30 16:35:52,199 [pN:main,p:62707,tN:Thread-9 (run_in_loop)] Calling on_end with no data
INFO:     127.0.0.1:58089 - "POST /api/histories HTTP/1.1" 200 OK
INFO:     127.0.0.1:58090 - "POST /api/workflows/upload HTTP/1.1" 200 OK
galaxy.workflow.run_request INFO 2024-09-30 16:35:52,319 [pN:main,p:62707,tN:AnyIO worker thread] Creating a step_state for step.id 471
galaxy.workflow.run_request INFO 2024-09-30 16:35:52,319 [pN:main,p:62707,tN:AnyIO worker thread] Creating a step_state for step.id 472
galaxy.workflow.run_request INFO 2024-09-30 16:35:52,319 [pN:main,p:62707,tN:AnyIO worker thread] Creating a step_state for step.id 473
galaxy.web_stack.handlers INFO 2024-09-30 16:35:52,320 [pN:main,p:62707,tN:AnyIO worker thread] (WorkflowInvocation[unflushed]) Handler '_default_' assigned using 'HANDLER_ASSIGNMENT_METHODS.DB_SKIP_LOCKED' assignment method
INFO:     127.0.0.1:58091 - "POST /api/workflows/f4df8294d9246e23/invocations HTTP/1.1" 200 OK
INFO:     127.0.0.1:58092 - "GET /api/invocations/f356c15ec7800da0 HTTP/1.1" 200 OK
galaxy.jobs.handler DEBUG 2024-09-30 16:35:52,520 [pN:main,p:62707,tN:WorkflowRequestMonitor.monitor_thread] Grabbed WorkflowInvocation(s): 11
galaxy.workflow.scheduling_manager DEBUG 2024-09-30 16:35:52,523 [pN:main,p:62707,tN:WorkflowRequestMonitor.monitor_thread] Attempting to schedule workflow invocation [11]
galaxy.workflow.scheduling_manager INFO 2024-09-30 16:35:52,553 [pN:main,p:62707,tN:WorkflowRequestMonitor.monitor_thread] Failed to materialize dataset for workflow 11 - HistoryDatasetAssociation <galaxy.model.HistoryDatasetAssociation(34) at 0x32f293820> in state error with null file size, this is not valid
galaxy.workflow.scheduling_manager ERROR 2024-09-30 16:35:52,554 [pN:main,p:62707,tN:WorkflowRequestMonitor.monitor_thread] An exception occured scheduling while scheduling workflows
Traceback (most recent call last):
  File "/Users/mvandenb/src/galaxy/lib/galaxy/workflow/scheduling_manager.py", line 348, in __attempt_materialize
    self.app.hda_manager.materialize(task_request, in_place=True)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/managers/hdas.py", line 200, in materialize
    session.commit()
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/scoping.py", line 597, in commit
    return self._proxied.commit()
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2028, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1313, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1288, in _prepare_impl
    self.session.flush()
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4352, in flush
    self._flush(objects)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4380, in _flush
    self.dispatch.before_flush(self, flush_context, objects)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/event/attr.py", line 378, in __call__
    fn(*args, **kw)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/model/base.py", line 184, in before_flush
    for obj in versioned_objects_strict(session.dirty):
  File "/Users/mvandenb/src/galaxy/lib/galaxy/model/base.py", line 171, in versioned_objects_strict
    obj.__strict_check_before_flush__()
  File "/Users/mvandenb/src/galaxy/lib/galaxy/model/__init__.py", line 5272, in __strict_check_before_flush__
    raise Exception(
Exception: HistoryDatasetAssociation <galaxy.model.HistoryDatasetAssociation(34) at 0x32f293820> in state error with null file size, this is not valid

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mvandenb/src/galaxy/lib/galaxy/workflow/scheduling_manager.py", line 322, in __monitor
    self.__schedule(workflow_scheduler_id, workflow_scheduler)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/workflow/scheduling_manager.py", line 332, in __schedule
    self.__attempt_schedule(invocation_id, workflow_scheduler)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/workflow/scheduling_manager.py", line 365, in __attempt_schedule
    if not self.__attempt_materialize(workflow_invocation, session):
  File "/Users/mvandenb/src/galaxy/lib/galaxy/workflow/scheduling_manager.py", line 358, in __attempt_materialize
    session.commit()
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2028, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1313, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1288, in _prepare_impl
    self.session.flush()
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4352, in flush
    self._flush(objects)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4380, in _flush
    self.dispatch.before_flush(self, flush_context, objects)
  File "/Users/mvandenb/src/galaxy/.venv/lib/python3.10/site-packages/sqlalchemy/event/attr.py", line 378, in __call__
    fn(*args, **kw)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/model/base.py", line 184, in before_flush
    for obj in versioned_objects_strict(session.dirty):
  File "/Users/mvandenb/src/galaxy/lib/galaxy/model/base.py", line 171, in versioned_objects_strict
    obj.__strict_check_before_flush__()
  File "/Users/mvandenb/src/galaxy/lib/galaxy/model/__init__.py", line 5272, in __strict_check_before_flush__
    raise Exception(
Exception: HistoryDatasetAssociation <galaxy.model.HistoryDatasetAssociation(34) at 0x32f293820> in state error with null file size, this is not valid

This is being repeated across all tests, so I think we keep trying to schedule the invocation unless we also fail it.
The other thing is that we need to set the file size before we commit with a terminal state.
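
The retry behavior described here can be sketched with a toy loop. This is only an illustration under assumed names (`run_iteration`, the `states` dictionary, and the `fail_on_error` flag are all hypothetical and do not exist in `scheduling_manager.py`): when a materialization error is not turned into a terminal state, the same invocation comes back on every pass.

```python
from typing import Callable, Dict, List


def run_iteration(
    pending: List[int],
    materialize: Callable[[int], None],
    states: Dict[int, str],
    fail_on_error: bool = True,
) -> List[int]:
    """Run one pass of a toy scheduling loop and return the ids to retry."""
    retry: List[int] = []
    for invocation_id in pending:
        try:
            materialize(invocation_id)
        except Exception:
            if fail_on_error:
                # Terminal state: the invocation is never picked up again.
                states[invocation_id] = "failed"
            else:
                # Without a terminal state the same error recurs on every pass.
                retry.append(invocation_id)
            continue
        states[invocation_id] = "scheduled"
    return retry
```

Calling it with `fail_on_error=False` and a `materialize` function that always raises for one id returns that id on every pass, which matches the repeated traceback in the log above.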

@jmchilton
Member Author

Gotcha. I will redo this to fail the invocation on materialization error.

@jmchilton
Member Author

@mvdbeek The workflow failure stuff is so cool - let me know if I used it right. I guess we might want a more specific "reason" code.

if not self.__attempt_materialize(workflow_invocation, session):
    return None
if self.app.config.workflow_scheduling_separate_materialization_iteration:
    return None
Member

@mvdbeek mvdbeek Oct 1, 2024

Does this have any measurable effect? I would imagine that materializing even a small dataset takes much longer than scheduling a workflow. As I understand it, we now spend a lot of time materializing a number of inputs (unless deferred), and then, based on this setting, we proceed (or not) to schedule the workflow.

@mvdbeek
Member

mvdbeek commented Oct 1, 2024

We can always get more specific later on, that part is fine. I would totally like to merge this now, but I worry that we might make scheduling performance very unpredictable if we undefer in the scheduling loop. I don't think it's a ton of work to move that into Celery, I'll give that a crack ... worst case, we only allow deferred inputs in the landing API?

@jmchilton
Member Author

“Premature optimization is the root of all evil” -Tony Hoare. We should let it kill a handler before we worry - we have no clue if the feature will ever be used or if it would be a problem for the use cases we have in mind. Deferred data, download caches, no one using the feature... all might negate the value in that effort.

@mvdbeek
Member

mvdbeek commented Oct 1, 2024

I see this more from the angle of: can one unprivileged user break scheduling for everyone else? My experience with these niche features is that they will eventually be used, and at that point admins will not know why their schedulers are stuck.
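
To make the concern concrete, here is a small sketch of the alternative being discussed: handing materialization to a worker so that a scheduling pass never blocks on a download. A `ThreadPoolExecutor` is used purely as a stand-in for a Celery task, and `slow_materialize` and `scheduling_pass` are hypothetical names, not Galaxy code.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List


def slow_materialize(invocation_id: int) -> int:
    """Stand-in for downloading a URL input; the sleep simulates transfer time."""
    time.sleep(0.2)
    return invocation_id


def scheduling_pass(
    pending: List[int],
    downloads: Dict[int, Future],
    pool: ThreadPoolExecutor,
) -> List[int]:
    """Return the ids ready to schedule in this pass without waiting on downloads."""
    ready: List[int] = []
    for invocation_id in pending:
        future = downloads.get(invocation_id)
        if future is None:
            # First sighting: start the download in the background and move on.
            downloads[invocation_id] = pool.submit(slow_materialize, invocation_id)
        elif future.done():
            ready.append(invocation_id)
    return ready
```

In this shape a slow or stuck download only delays its own invocation. Other invocations in the same pass are still examined, which is the property an in-loop materialization cannot guarantee.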

@mvdbeek mvdbeek merged commit 2ead0c4 into galaxyproject:dev Oct 1, 2024
56 checks passed
@jmchilton
Member Author

I've written many niche features for Galaxy no one has ever used 😅. First sign of trouble and I will offer to redo it in Celery.

@jdavcs jdavcs added the highlight Included in user-facing release notes at the top label Nov 20, 2024
@jdavcs jdavcs added highlight/dev Included in admin/dev release notes and removed highlight Included in user-facing release notes at the top labels Dec 19, 2024